Mycobacteria functional screening and/or expression vectors

ABSTRACT

Recombinant screening, cloning and/or expression vector characterized in that it replicates in mycobacteria and contains 1) a replicon which is functional in mycobacteria; 2) a selection marker; 3) a reporter cassette comprising a) a multiple cloning site (polylinker), b) a transcription terminator which is active in mycobacteria and is located upstream of the polylinker, and c) a coding nucleotide sequence derived from a gene coding for an expression, export and/or secretion protein marker, the nucleotide sequence being deprived of its initiation codon and its regulating sequences. This vector is used for identification and expression of exported polypeptides, such as the Mycobacterium tuberculosis P28 antigen.

The Mycobacterium genus includes major human pathogens such as M. leprae and M. tuberculosis, the agents responsible for leprosy and tuberculosis, which remain serious public health problems world-wide.

M. bovis and M. tuberculosis, the causative agents of tuberculosis, are facultative intracellular bacteria. Despite the major health problems linked to these pathogenic organisms, little is known about their exported and/or secreted proteins. SDS-PAGE analyses of M. tuberculosis culture filtrate show at least 30 secreted proteins (1, 19, 38). Some of them have been characterized, their genes cloned and sequenced (7, 35, 37). Others, although they are immunodominant antigens of major importance for inducing protective immunity (2, 21), have not been completely identified. In addition, it is probable that a great number of exported proteins remain attached to the cell membrane and, consequently, are not present in culture supernatants. It has been shown that proteins located at the outer surface of various pathogenic bacteria, such as the 103 kDa Yersinia pseudotuberculosis invasin (14) or the 80 kDa Listeria monocytogenes internalin (10), play an important role in interactions with the host cells and, consequently, in pathogenicity as well as in the induction of protective responses. Thus, a membrane-bound protein could be important for M. tuberculosis infection as well as for the induction of a protective response against this infection. These proteins could certainly be of interest for the preparation of vaccines.

The BCG (Bacille Calmette-Guérin), an avirulent strain derived from M. bovis, has been widely used as a vaccine against tuberculosis. It is also a very important vector for the construction of live recombinant vaccines, particularly because of its high immunogenicity. Consequently, the study of the molecular biology of mycobacteria is currently of great interest.

The development of new vaccines against pathogenic mycobacteria, or the improvement of available vaccines, requires the development of specific tools which make it possible to isolate or obtain immunogenic polypeptide sequences.

The inventors have defined and produced, for this purpose, new vectors allowing the screening of mycobacteria DNA sequences in order to identify, among these sequences, nucleic acids encoding proteins of interest.

Vectors have been defined for evaluating the efficacy of sequences for regulation of expression in mycobacteria.

The invention also relates to new mycobacteria polypeptides which may have been isolated by means of the preceding vectors and are capable of entering into the production of compositions for the detection of a mycobacteria infection, or for protection against an infection due to mycobacteria.

The subject of the invention is therefore a recombinant screening and/or cloning and/or expression vector, characterized in that it replicates in mycobacteria and in that it contains

1) a replicon which is functional in mycobacteria;

2) a selectable marker;

3) a reporter cassette comprising

a) a multiple cloning site (polylinker),

b) a transcription terminator which is active in mycobacteria, upstream of the polylinker, and

c) a coding nucleotide sequence derived from a gene encoding a marker for expression and/or export and/or secretion of protein, said nucleotide sequence lacking its initiation codon and its regulatory sequences.

The marker for export and/or secretion is a nucleotide sequence whose expression followed by export and/or secretion depends on regulatory elements which control its expression.

“Sequences or elements for regulation of expression” is understood to mean a promoter sequence for transcription, a sequence comprising the ribosome-binding site (RBS), and the sequences responsible for export and/or secretion such as the sequence termed signal sequence.

A first advantageous marker for export and/or expression is a coding sequence derived from the PhoA gene. Where appropriate, it is truncated such that the alkaline phosphatase activity is, nevertheless, capable of being restored when the truncated coding sequence is placed under the control of a promoter and of appropriate regulatory elements.

Other markers for exposure and/or export and/or secretion may be used. There may be mentioned by way of examples a sequence of the gene for β-agarase or for nuclease of a staphylococcus or for β-lactamase of a mycobacterium.

The transcription terminator should be functional in mycobacteria. An advantageous terminator is, in this regard, the T4 coliphage terminator (tT4). Other terminators appropriate for carrying out the invention may be isolated using the technique presented in the examples, for example by means of the vector pJN3.

A vector which is particularly preferred for carrying out the invention is the plasmid pJEM11 deposited at CNCM (Collection Nationale de Cultures de Microorganismes, Paris, France) under the No. I-1375, on Nov. 3, 1993.

For the selection or the identification of mycobacteria nucleic acid sequences encoding products capable of being incorporated into immunogenic or antigenic compositions for the detection of a mycobacteria infection, the vector of the invention will comprise, in one of the polylinker sites, a nucleotide sequence from a mycobacterium in which the presence of regulatory sequences is being sought which are associated with all or part of a gene of interest making it possible, when the vector carrying these sequences (recombinant vector) is integrated or replicates in a mycobacterium-type cellular host, to obtain the exposure at the level of the cell wall or membrane of the host, and/or export and/or secretion of the product of expression of the abovementioned nucleotide sequence.

The mycobacteria sequence in question may be any sequence for which attempts are made to detect if it contains elements for regulation of expression associated with all or part of a gene of interest and capable of allowing or promoting exposure at the level of the cell membrane of a host in which it might be expressed, and/or export and/or secretion of a product of expression of a given coding sequence and, by way of test, of the marker for export and/or secretion.

Preferably, this sequence is obtained by enzymatic digestion of the genomic DNA or of the DNA complementary to an RNA of a mycobacterium and preferably of a pathogenic mycobacterium.

According to a first embodiment of the invention, the enzymatic digestion of the genomic DNA or of the complementary DNA is carried out using M. tuberculosis.

Preferably, this DNA is digested with an enzyme such as Sau3A.

Other digestive enzymes such as ScaI, ApaI, ScaII, KpnI or alternatively exonucleases or polymerases, may naturally be used, as long as they allow fragments to be obtained whose ends may be inserted into one of the cloning sites of the polylinker of the vector according to the invention.

Where appropriate, digestions with different enzymes will be carried out simultaneously.

Preferred recombinant vectors for carrying out the invention are chosen from among the following recombinant vectors deposited at CNCM on Aug. 8, 1994:

pExp53 deposited at CNCM under the No. I-1464

pExp59 deposited at CNCM under the No. I-1465

pExp410 deposited at CNCM under the No. I-1466

pExp421 deposited at CNCM under the No. I-1467.

The vectors of the invention may also be used to determine the presence of sequences of interest, according to what was stated above, in mycobacteria such as M. africanum, M. bovis, M. avium or M. leprae whose DNA or cDNA will have been treated with determined enzymes.

The subject of the invention is also a process for screening nucleotide sequences derived from mycobacteria, to determine the presence, in these sequences, of regulatory elements controlling the expression, in a cellular host, of nucleic acid sequences containing them, and/or exposure at the surface of the cellular host and/or export and/or secretion of the polypeptide sequences resulting from the expression of the abovementioned nucleotide sequences, characterized in that it comprises the following steps:

a) digestion of mycobacteria DNA sequences with at least one determined enzyme and recovery of the digests obtained,

b) insertion of the digests into a cloning site, compatible with the enzyme of step a), of the polylinker of a vector above,

c) if necessary, amplification of the digest contained in the vector, for example by replication of the latter after insertion of the vector thus modified into a determined cell, for example E. coli,

d) transformation of cellular hosts by the vector amplified in step c), or in the absence of amplification, by the vector of step b),

e) culture of the transformed cellular hosts in a medium allowing visualization of the marker for export and/or secretion which is contained in the vector,

f) detection of the cellular hosts which are positive for the expression of the marker for exposure and/or export and/or secretion (positive colonies),

g) isolation of the DNA of the positive colonies and insertion of this DNA into a cell which is identical to that of step c),

h) selection of the inserts contained in the vector, which allow clones to be obtained which are positive for the marker for export and/or secretion,

i) isolation and characterization of the fragments of DNA of mycobacteria which are contained in these inserts.

The carrying out of this process allows the construction of DNA libraries containing sequences capable of being exported and/or secreted, when they are produced in recombinant mycobacteria.

Step i) of the process may comprise a step for sequencing the inserts selected.

Preferably, the vector used is the plasmid pJEM11 (CNCM I-1375) and the digestion is carried out by means of the enzyme Sau3A.

According to a preferred embodiment of the invention, the screening process is characterized in that the mycobacteria sequences are derived from a pathogenic mycobacterium, for example from M. tuberculosis, M. bovis, M. avium, M. africanum or M. leprae.

The subject of the invention is also the nucleotide sequences of mycobacteria selected after carrying out the process described above.

According to a specific embodiment of the invention, advantageous sequences are for example the mycobacteria DNA fragments contained in the vectors pIPX412 (CNCM I-1463 deposited on Aug. 8, 1994), pExp53, pExp59, pExp410 or pExp421.

When the coding sequence derived from the marker gene for export and/or secretion is a sequence derived from the PhoA gene, the export and/or secretion of the product of the PhoA gene, truncated where appropriate, is obtained only when this sequence is inserted in phase with the sequence placed upstream, which contains the elements controlling the expression and/or export and/or secretion which are derived from a mycobacteria sequence.
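The in-phase requirement can be illustrated by a minimal sketch. It assumes that the upstream mycobacterial fragment is counted from its initiation codon up to the fusion junction, and that the vector junction itself adds a whole number of codons; the function name `is_in_frame` and the numbers used are hypothetical and are not taken from the vectors described here.

```python
def is_in_frame(upstream_coding_nt):
    """Return True when the coding nucleotides supplied by the upstream
    fragment (initiation codon up to the fusion junction) form a whole
    number of codons, so that the truncated phoA sequence that follows
    is translated in its own reading frame."""
    if upstream_coding_nt < 3:
        raise ValueError("the upstream fragment must supply at least an initiation codon")
    return upstream_coding_nt % 3 == 0


# Hypothetical junction positions: only one junction in three keeps the frame.
in_frame_positions = [n for n in range(90, 99) if is_in_frame(n)]
print(in_frame_positions)
```

Under these assumptions, a fragment library with randomly placed junctions is expected to give an in-phase fusion for roughly one junction in three, which is one reason why only a fraction of the clones carrying a suitable upstream sequence would be detected.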

The subject of the invention is also recombinant mycobacteria containing a recombinant vector described above. A preferred mycobacterium is a mycobacterium of the M. smegmatis type.

M. smegmatis makes it possible, advantageously, to test the efficiency of mycobacteria sequences for controlling the expression and/or export and/or secretion of a given sequence, for example of a sequence encoding a marker such as alkaline phosphatase.

Another advantageous mycobacterium is a mycobacterium of the M. bovis type, for example the BCG strain currently used for vaccination against tuberculosis.

A subject of the invention is, moreover, a recombinant mycobacterium, characterized in that it contains a recombinant vector defined above.

The invention also relates to a nucleotide sequence derived from a gene encoding an exported M. tuberculosis protein, characterized in that it is chosen from the following sequences:

a sequence IA corresponding to the chain of nucleotides described in FIG. 6A, or a sequence IB corresponding to the chain of nucleotides described in FIG. 6B, or hybridizing under stringent conditions with these chains,

a sequence II comprising the chain of nucleotides IA or IB and encoding an M. tuberculosis P28 protein having a theoretical molecular weight of about 28 kDa and an observed molecular weight of 36 kDa, determined by denaturing acrylamide gel electrophoresis (SDS-PAGE),

a sequence III contained in the sequence IA or IB and encoding a polypeptide recognized by antibodies directed against the M. tuberculosis P28 protein,

a sequence IV comprising the regulatory sequences of the gene comprising the coding sequence IA or IB,

a sequence V corresponding to the chain between nucleotides 1 and 72 of the sequence IA or IB and corresponding to the signal sequence,

a sequence VI corresponding to the chain between nucleotides 62 to 687 of the sequence IA or IB,

a sequence VII corresponding to the chain between nucleotides 688 and 855 of the sequence IA or IB.
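The nucleotide positions given for sequences V and VII can be read as 1-based, inclusive coordinates. The following sketch, given purely by way of illustration, extracts such segments from a placeholder string; the helper `subsequence` and the placeholder `toy` are hypothetical, and `toy` is not the sequence IA or IB of FIG. 6.

```python
def subsequence(seq, start, end):
    """Return nucleotides start..end of seq, with 1-based inclusive positions."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("positions must satisfy 1 <= start <= end <= len(seq)")
    return seq[start - 1:end]


# Placeholder of 856 nt standing in for a real sequence (NOT sequence IA or IB).
toy = "ACGT" * 214

signal_segment = subsequence(toy, 1, 72)    # positions given for sequence V
tail_segment = subsequence(toy, 688, 855)   # positions given for sequence VII

# Both segments span a whole number of codons.
print(len(signal_segment), len(signal_segment) // 3)
print(len(tail_segment), len(tail_segment) // 3)
```

With this reading, the segment between nucleotides 1 and 72 covers 24 codons and the segment between nucleotides 688 and 855 covers 56 codons.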

Also entering within the framework of the invention is an M. tuberculosis polypeptide characterized in that it corresponds to the amino acid chain VIIIA or to the chain VIIIB represented in FIGS. 6A and 6B respectively or in that it comprises one of these chains.

A preferred polypeptide is characterized in that it has a theoretical molecular weight of about 28 kDa determined according to the technique described in the examples.

The M. tuberculosis p28 protein has been characterized by its capacity to be exported and therefore potentially located across the bacterial plasma membrane or the cell wall. Furthermore, as shown in the sequences presented in FIG. 6, some peptide units of the sequence are repeated. For these reasons, the M. tuberculosis p28 protein is now most often designated as ERP protein and the gene containing the coding sequence for this protein is called either irsa gene or erp gene.

The theoretical molecular weight of the ERP protein, evaluated at 28 kDa, corresponds to an experimentally observed molecular weight of about 36 kDa (electrophoretic migration on a denaturing polyacrylamide gel (SDS-PAGE)).

Another advantageous polypeptide within the framework of the invention comprises part of the amino acid chain VIIIA or VIIIB previously described and immunologically reacts with antibodies directed against the M. tuberculosis p28 protein.

Preferably, such a polypeptide is, in addition, characterized in that it does not immunologically react with the M. leprae p28 protein.

Particularly advantageous amino acid sequences within the framework of the invention are the sequences comprising one of the following chains or corresponding to one of these chains in one or more copies: PGLTS (SEQ ID NO:1), PGLTD (SEQ ID NO:2), PGLTP (SEQ ID NO:3), PALTN (SEQ ID NO:4), PALTS (SEQ ID NO:5), PALGG (SEQ ID NO:6), PTGAT (SEQ ID NO:7), PTGLD (SEQ ID NO:8), PVGLD (SEQ ID NO:9).
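The presence of these five-residue chains "in one or more copies" can be checked on any amino acid sequence with a short script such as the one sketched below. The function `count_motifs` and the test string `toy_protein` are hypothetical illustrations; `toy_protein` is an invented string and is not the chain VIIIA or VIIIB of FIG. 6.

```python
import re

# The nine five-residue chains listed above (SEQ ID NO:1 to NO:9).
MOTIFS = ["PGLTS", "PGLTD", "PGLTP", "PALTN", "PALTS",
          "PALGG", "PTGAT", "PTGLD", "PVGLD"]


def count_motifs(protein):
    """Count the copies of each listed chain found in a one-letter
    amino acid sequence; chains that are absent are omitted."""
    counts = {}
    for motif in MOTIFS:
        # The lookahead counts every start position, including overlapping ones.
        copies = len(re.findall("(?=" + motif + ")", protein.upper()))
        if copies:
            counts[motif] = copies
    return counts


# Invented example sequence, for illustration only.
toy_protein = "MKPGLTSAAPGLTDGGPALTNPGLTS"
print(count_motifs(toy_protein))
```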

Other advantageous sequences are, for example, the signal sequence between the positions of nucleotides 1 and 72 of the sequence of FIG. 6A or 6B or alternatively the sequence between nucleotides 688 and 855 which is capable of behaving like a transmembrane sequence.

These polypeptide sequences may be expressed in the form of recombinant polypeptides. In these recombinant polypeptides, they may be replaced in part, especially as regards the sequences of 5 amino acids previously described, by sequences of interest obtained from mycobacteria or other pathogenic organisms, it being possible for this replacement to lead to the inclusion, inside the recombinant polypeptides, of the epitopes or the antigenic determinants of a pathogenic organism or of a protein of interest against which it might be sought to obtain antibodies.

Thus, the polypeptides of the invention, while optionally exhibiting themselves the antigenic or even immunogenic properties, may be used as advantageous carrier molecules for preparing, where appropriate, vaccines having varying properties.

The subject of the invention is also monoclonal antibodies or polyclonal sera directed against a polypeptide as defined above.

As regards monoclonal antibodies, they are preferably directed specifically against a polypeptide of the invention and do not recognize, for example, the M. leprae p28 protein.

The subject of the invention is also a composition for the in vitro detection of an M. tuberculosis infection, characterized in that it comprises a polypeptide defined above, which is capable of immunologically reacting with antibodies formed in a patient infected with M. tuberculosis.

Another composition for the in vitro detection of an M. tuberculosis infection is characterized by a nucleotide sequence containing at least 9 nucleotides, which is derived from a sequence defined above, or a nucleotide sequence containing at least 9 nucleotides and hybridizing, under stringent conditions, with M. tuberculosis DNA and not hybridizing, under the same conditions, with M. leprae DNA, this sequence being a DNA or RNA sequence, which is labeled where appropriate.

The subject of the invention is also a prokaryotic or eukaryotic cellular host, characterized in that it is transformed by a nucleotide sequence as described in the preceding pages, under conditions allowing the expression of this sequence and/or its exposure at the level of the membrane of the cellular host and/or its export and/or its secretion from the abovementioned membrane.

Preferably, the cellular hosts are mycobacteria such as M. smegmatis or M. bovis BCG.

Other cellular hosts are for example E. coli, CHO, BHK, Sf9/Baculovirus cells, yeasts such as Saccharomyces cerevisiae, vaccinia virus.

The subject of the invention is also an immunogenic composition comprising a polypeptide as presented above or a cellular host as defined above.

The invention relates, moreover, to a vector for the screening and/or cloning and/or expression of nucleotide sequences which are functional in mycobacteria, and which is derived from a vector described above and characterized in that the coding sequence derived from a gene encoding a marker for export and/or secretion is replaced by a reporter gene or a reporter sequence.

Preferably, the reporter sequence or gene lacks its regulatory sequences, in particular its ribosome binding sequences and/or its sequences which allow the export and/or secretion of the marker produced when the vector is incorporated into a recombinant cellular host.

Preferably, the reporter sequence or gene contains the sequence encoding the lacZ gene or a part of this sequence which is sufficient for the polypeptide to exhibit a β-galactosidase activity.

A preferred vector of the invention is characterized in that it comprises at one of the cloning sites of the polylinker, a chain of nucleotides comprising a promoter and, where appropriate, regulatory sequences, for example for anchorage at the surface, the export or even the secretion of a polypeptide which might be produced under the control of the promoter, for which it is desired to evaluate the capacity to promote or regulate the expression of a reporter nucleotide sequence in mycobacteria.

Preferred vectors are plasmids chosen from the plasmids pJEM12, pJEM13, pJEM14, or pJEM15 as represented in FIG. 12.

Such a vector may be used to evaluate the value of sequences for regulation of expression or of promoters, for example, the pAN, pblaF*, psul3, pgroES/EL1 sequences.

The invention also comprises a process for determining the activity of a sequence containing at one of the cloning sites of the polylinker a chain of nucleotides comprising a promoter and, where appropriate, regulatory sequences, for example for the exposure, export or even secretion of a polypeptide which might be produced under the control of the promoter in mycobacteria, characterized in that it comprises the steps of:

transforming a mycobacterium strain, for example M. smegmatis or M. tuberculosis, with a vector described above,

detecting the activity normally associated with the presence of the reporter gene or of the reporter sequence.

Other characteristics and advantages of the invention appear on reading the examples which follow as well as in the figures.

LEGEND TO THE FIGURES

FIG. 1

Construction of pJEM11.

See Materials and Methods. pJEM11 has replication origins (ori) of E. coli and mycobacteria. It is therefore a shuttle plasmid. The selectable marker is the kanamycin (Km) resistance gene. The truncated PhoA gene of pPHO7 (22) lacks a promoter, a start codon and a signal sequence; thus the expression and export of PhoA depend on the translational fusion with the amino-terminal ends of other proteins. The transcriptional terminator (T) of the omega cassette avoids transcription by “read-through” using plasmid sequences.

FIG. 2

Construction of the plasmids pLA71, pLA72 and pLA73.

The insertion into the BamHI site of pJEM11 of BlaF* fragments (34) of 3 different lengths leads to the expression of fusion proteins with the phoA activity. Colorimetric assays were carried out according to the Brockman and Heppel technique (8), with p-nitrophenyl phosphate as substrate. The protein contents were measured with the aid of the Bio-Rad assay. The arbitrary alkaline phosphatase units (aU) were calculated as described in Materials and Methods.

FIG. 3

Western-blot analyses of PhoA fusion proteins.

Transformed M. smegmatis strains were cultured in Beck's medium containing kanamycin (20 μg/ml). Total extracts of sonicated bacteria were solubilized with SDS, resolved by SDS-PAGE and subjected to immunoblotting. The preparation of the rabbit anti-PhoA serum has been previously described (34). PhoA-coupled rabbit antibodies (Promega) and, as substrate, a mixture of X-P and nitro blue tetrazolium (BCIP-NBT, Promega) were used to reveal the PhoA fusions. Column 1: purified bacterial PhoA. M. smegmatis transformed by plasmids pJEM11: column 2; pLA71: column 3; pLA72: column 4; pLA73: column 5; pExp410: column 6; pExp53: column 7; pExp59: column 8; pExp421: column 9.

FIG. 4

Nucleotide sequences and deduced amino acid sequences of segments of inserts selected from the plasmids pExp410, pExp53, pExp59 and pExp421.

The M. smegmatis clones with the alkaline phosphatase activity were selected on X-P/kanamycin dishes. Their plasmids were amplified in E. coli XL-1 B, and the nucleotide sequence of the inserts determined as described in Materials and Methods. A: pExp410 includes part of the 19 kDa lipoprotein. The reading frame is maintained at the junction with phoA (BamHI/Sau3A). B: pExp53 includes part of a gene exhibiting similarities with the 28 kDa M. leprae antigen. The divergent amino acids are in bold type. The codon for initiation of translation is GTG. The putative sites of cleavage by signal peptidase are indicated by arrows. C: pExp59 encodes a characteristic signal sequence. A putative ribosome-binding site (RBS) is underlined. The putative site of cleavage by signal peptidase is indicated by an arrow. D: pExp421 encodes amino acid units conserved with proteins of the family of stearoyl-acyl carrier protein (ACP) desaturases. R. comm: R. communis (ricin).

FIG. 5

The gene which is similar to the gene for the 28 kDa M. leprae antigen is present in a single copy in the M. tuberculosis genome.

The M. tuberculosis genomic DNA was extracted according to standard procedures (27), digested with endonucleases PstI, SmaI, BstEII, SphI, BamHI and subjected to migration on a 1% agarose gel. The Southern-blot hybridization was carried out according to standard procedures (27). The 32P-labeled probe was a 180 bp PCR fragment of the pExp53 insert.

FIG. 6

Nucleotide sequence (IA and IB) and amino acid sequence (VIIIA and VIIIB) of the product of the IRSA gene encoding the M. tuberculosis P28 protein (two variants are presented). This gene is now designated by the abbreviation “erp” corresponding to the expression “exported repetitive protein”.

FIG. 7

Preliminary nucleotide sequences flanking the M. tuberculosis IRSA gene.

FIG. 8

Bacteria genes for the regulation of iron (IRG's).

FIG. 9

Hydrophilicity profile of the M. leprae and M. tuberculosis P28 proteins.

FIG. 10

A) Alignment of the nucleotide sequences of the genes encoding the M. tuberculosis and M. leprae p28 proteins.

B) Alignment of the amino acid sequences of the M. tuberculosis and M. leprae p28 proteins.

FIG. 11

Construction of the plasmids pJN3 and pJN11.

Only the relevant genetic elements and restriction sites are shown. The plasmids pRR3 and pJN1 have been described in the prior art (60) (58). The omega cassette was obtained by digestion of pHP45Ω with SmaI (59), followed by an agarose gel purification of a 2 kb fragment using the Geneclean kit (Bio 101 Inc.). Standard recombinant DNA techniques were used in accordance with the description given in the state of the art (61). In pJN3 and pJN11, the β-lactamase (bla) gene has been interrupted. oriE and oriM designate the replication origins of pUC (E. coli) and of pAL5000 (mycobacteria), respectively.

FIG. 12

Structure of the plasmids of the pJEM series.

(A) In the schematic representation of the plasmids, only the relevant genetic elements are indicated. pJEM15 resulted from the cloning, into the ScaI site of pRR3, i) of a fragment obtained by PCR amplification (using OJN1: 5′-AAGCTTCCGATTCGTAGAGCC-3′ (SEQ ID NO:10) and OJN2: 5′-GGGCTCGAGCTGCAG TGGATGACCTTTTGA-3′ (SEQ ID NO:11) as primers, and pJN11 as template) and containing tT4 and the N-terminal end of cII; ii) of the synthetic oligonucleotides corresponding to MCS1; and iii) of the HindIII-DraI lacZ′ fragment of pNM480. pJEM12-13-14 were obtained by cloning the PCR-amplified fragment described above into the ScaI site of pRR3. The synthetic oligonucleotides corresponding to MCS2 were then inserted. Finally, each of the three forms of the pNM480 series was introduced into the HindIII site in MCS2. (B) Nucleotide sequences of the regions between the OJN1 primer and the 8th lacZ′ codon (marked ****). These sequences were checked experimentally. The tT4 region is underlined and the synthetic RBS is in bold type. The amino acid sequence of the N-terminal end of cII is given under the DNA sequence. The HindIII sites are marked by an asterisk because they are not unique. For additional descriptions, see the legend in FIG. 11.

EXAMPLES

I) Identification of Genes Encoding Exported M. tuberculosis Proteins

The results reported here describe the definition, for mycobacteria, of a genetic method of identification of exported proteins. This methodology is based on the translational fusion with bacterial alkaline phosphatase (PhoA). Such fusion proteins must be exported in order to have the PhoA activity (6, 13, 16). A PhoA gene was used after deletion of the promoter region, of the ribosome-binding site and of the entire region encoding the signal sequence, including its codon for initiation of translation. Thus, the alkaline phosphatase activity is dependent on the translational fusion achieved in the correct reading frame with part of an exported protein. The construction of a phoA plasmid vector for mycobacteria is described first, and it is shown that the introduction, into this vector, of the gene for the exported M. fortuitum β-lactamase (blaF*) (34) leads to the production, in M. smegmatis, of fusion proteins having the PhoA enzymatic activity. A library of sequences for fusion between the M. tuberculosis genomic DNA and the phoA gene was then constructed. Twelve independent clones, which exported fusion proteins, were isolated. Among them, it was possible to identify the 19 kDa exported lipoprotein already described in M. tuberculosis, a new M. tuberculosis sequence exhibiting similarities with the 28 kDa M. leprae protein, a protein comprising amino acid residues conserved with stearoyl-acyl carrier protein (ACP) desaturases, and other new sequences.

Materials and Methods

Bacterial Strains, Plasmids, and Culture Conditions

The bacterial strains and the plasmids used in this study are presented in Table 1. The growth of E. coli and M. smegmatis strains, the electroporation, and the screening on agar containing 20 μg/ml of kanamycin and 20 μg/ml of 5-bromo-4-chloro-3-indolyl phosphate (X-P) were performed as previously described (14).

M. tuberculosis, an isolate from a patient (strain 103), was cultured on solid Löwenstein-Jensen medium.

Manipulation and Sequencing of DNA

Manipulation of DNA and Southern-blot analyses were carried out with the aid of standard techniques (27). For the determinations of the sequences, the oligonucleotides (5′-GGCCCGACGAGTCCCGC-3′ (SEQ ID NO:12) and 5′-TTGGGGACCCTAGAGGT-3′ (SEQ ID NO:13)) were developed for sequencing across the fusion junctions of the M. tuberculosis inserts in pJEM11 (see below). The double-stranded plasmid DNA sequences were determined by the dideoxy chain termination method (28) using the T7 sequencing kit (Pharmacia) according to the manufacturer's instructions, or with the Taq Dye Deoxy Cycle Terminator sequencing kit (Applied Biosystems), on a GeneAmp 9600 PCR system (Perkin Elmer), and passed over a DNA analysis system, Model 373 (Applied Biosystems).

Analyses of the Databanks

The nucleotide sequences were compared with those of the EMBL and GenBank databanks using the FASTA algorithm (23) and the derived protein sequences were analyzed to determine a possible similarity with the sequences contained in the PIR and SwissProt protein databanks using the BLAST algorithm (1).

Constructions of the Plasmids

pJEM11: The construction of pJEM11 is summarized in FIG. 1. Briefly, pJEM2 was constructed using the E. coli-mycobacteria shuttle plasmid pRR3 (26), by insertion of the truncated lacZ fragment of pNM480 (18), a multiple cloning site or polylinker (MCS), and the transcriptional terminator of the omega cassette (24). The N-terminal EcoRV-KpnI fragment of lacZ was replaced with the truncated phoA fragment of pPHO7 (11), without initiation codon or signal sequence, to give pJEM10. Finally, a potential initiation codon in the MCS was eliminated in order to give pJEM11.

pLA71, pLA72 and pLA73: Fragments of blaF* (34) of different lengths, obtained by PCR amplification, were inserted at the BamHI site of pJEM11 to give pLA71, pLA72 and pLA73 (FIG. 2). The oligonucleotides (Genset, Paris) used for the PCR amplification were, upstream, 5′-CGGGATCCTGCTCGGCGGACTCCGGG-3′ (SEQ ID NO:14) and, downstream, 5′-CGGGATCCGGTCATCGATCGGTGCCGCGAA-3′ (SEQ ID NO:15), 5′-CGGGATCCCGCCGTGCTCGGCCATCTGCAG-3′ (SEQ ID NO:16) and 5′-CGGGATCCAGAGTAAGGACGGCAGCACCAG-3′ (SEQ ID NO:17), for pLA71, pLA72 and pLA73 respectively. The PCR amplifications were carried out in a DNA Thermal Cycler (Perkin Elmer), using Taq polymerase (Cetus), according to the manufacturer's recommendations.

Construction of the M. tuberculosis Genomic Libraries

M. tuberculosis genomic DNA was extracted according to standard procedures (27). This DNA was partially digested with Sau3A (with 1 U per 2 μg) at 37° C. for 2 min 30 sec. The digestion was stopped by the addition of phenol. This DNA was then run on low-melting point agarose (Gibco, BRL). The fraction containing the fragments having from 400 to 2,000 bp was extracted with agarase (GELase, Epicentre Technologies) and ligated into the compatible BamHI site of pJEM11 with T4 DNA ligase (Boehringer Mannheim), at 16° C. overnight.
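The digestion and size-selection steps can be imitated in silico, as in the sketch below. It relies on the known recognition sequence of Sau3A (cleavage immediately 5′ of GATC, which gives ends compatible with a BamHI site). For simplicity it models a complete digest, whereas the library described here was made from a partial digest in which some sites remain uncut; the functions `sau3a_fragments` and `size_select` and the test sequence are hypothetical.

```python
def sau3a_fragments(dna):
    """Cut a linear sequence immediately 5' of every GATC (complete digest)."""
    dna = dna.upper()
    cut_positions = []
    position = dna.find("GATC")
    while position != -1:
        cut_positions.append(position)
        position = dna.find("GATC", position + 1)
    bounds = [0] + cut_positions + [len(dna)]
    # Skip the empty fragment that would arise if the sequence starts with GATC.
    return [dna[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


def size_select(fragments, min_bp=400, max_bp=2000):
    """Keep the fragments falling within the 400 to 2,000 bp window."""
    return [f for f in fragments if min_bp <= len(f) <= max_bp]


# Invented test sequence with two GATC sites, for illustration only.
toy_dna = "A" * 500 + "GATC" + "C" * 100 + "GATC" + "T" * 2500
fragments = sau3a_fragments(toy_dna)
selected = size_select(fragments)
print([len(f) for f in fragments], [len(f) for f in selected])
```

In a partial digest, fragments spanning one or more uncut GATC sites are also produced, so the real size distribution is broader than the one obtained with this simplified model.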

Assay of Alkaline Phosphatase

For the assays of alkaline phosphatase, M. smegmatis was cultured in L broth supplemented with 0.05% tyloxapol (Sigma) at 37° C. for 48 h. The alkaline phosphatase activity was assayed by the Brockman and Heppel method (8), in sonicated extracts as previously described (34), using p-nitrophenyl phosphate as substrate for the reaction. The protein contents were measured with the aid of the Bio-Rad assay (Bio-Rad). The alkaline phosphatase activity is expressed in arbitrary Units (aU)=OD₄₂₀×105×1 g of protein⁻¹×min⁻¹.

Preparations of Antibodies, SDS-polyacrylamide Gel Electrophoresis and Immunoblottings

The preparation of a rabbit anti-PhoA serum has been previously described (34). Cellular extracts of M. smegmatis were prepared by sonication; SDS-PAGE and immunoblotting were performed as previously described (36).

Results

Construction of a Shuttle Plasmid Vector (pJEM11) for the Production of Fusion Proteins with PhoA in M. smegmatis

pJEM11 carries a truncated E. coli phoA gene without initiation codon or any regulatory elements (FIG. 1). The multiple cloning site allows the insertion of fragments derived from genes encoding putative exported proteins together with their regulatory elements. Fusion proteins can thus be produced, and they express alkaline phosphatase activity when the fusion is exported. pJEM11 is an E. coli/mycobacteria shuttle plasmid which includes the kanamycin resistance gene of Tn903 as selectable marker.

Insertion of Genetic Elements Responsible for the Expression and Export of β-lactamase in pJEM11 Leads to the Production of PhoA Fusion Proteins Which are Enzymatically Active in M. smegmatis

The three plasmids were constructed by insertion of fragments of different lengths derived from the β-lactamase gene of the overproducing strain M. fortuitum D316 (blaF*) (34) at the BamHI site of pJEM11 (FIG. 2). In pLA71, the 1384 bp fragment includes the promoter, the segment encoding the 32 amino acids of the signal sequence, and the first 5 amino acids of the mature protein (there is no Shine-Dalgarno sequence for ribosomal attachment in the original sequence of blaF*). pLA72 carries a 1550 bp fragment including the elements encoding the signal sequence and the first 61 amino acids of the mature protein. In pLA73, the 2155 bp fragment contains the whole blaF*. These plasmids were used to transform M. smegmatis and the transformants were screened for enzymatically active PhoA fusions by plating on agar media containing kanamycin and X-P. X-P is soluble and colorless, but after cleavage of the phosphate by alkaline phosphatase, a blue precipitate is produced. Thus, alkaline phosphatase-producing clones could be easily identified by their blue color. The expression of pLA71, 72 and 73 in M. smegmatis led to blue colonies, whereas colonies with pJEM11 remained white. Western-blot analyses showed the production of PhoA fusion proteins with an apparent molecular weight of about 47.5 kDa, 54 kDa and 76 kDa, for pLA71, pLA72 and pLA73 respectively (FIG. 3, columns 3, 4, 5). These molecular weights are in agreement with the length of the mature protein fused with alkaline phosphatase (apparent MW of 46 kDa, FIG. 3, column 1). With pJEM11, there is no expression of PhoA, as expected (FIG. 3, column 2). The assay of the alkaline phosphatase activity (see FIG. 2) of these bacteria confirms the expression of an enzymatic activity with the 3 pLA constructs. However, M. smegmatis with pLA73 expresses an activity which is about twice as high as that with pLA71 and pLA72. In separate experiments, we have confirmed that the intracellular production of PhoA under the control of a mycobacterial promoter, without fusion with an exported protein, was not associated with the expression of alkaline phosphatase activity. All these results indicate that in this system, the activity of alkaline phosphatase depends on the translational fusion and the actual export of the product. Consequently, pJEM11 is suitable for the genetic identification of the proteins exported by mycobacteria.
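The agreement between the observed molecular weights and the length of the fused mature segment can be checked with a rough calculation, sketched below in Python. The average residue mass of about 0.11 kDa is a common rule of thumb and is an assumption of this example, not a value taken from the text; the function name is hypothetical.

```python
PHOA_APPARENT_KDA = 46.0      # apparent MW of PhoA given in the text
AVERAGE_RESIDUE_KDA = 0.110   # rule-of-thumb average residue mass (assumption)


def expected_fusion_kda(n_mature_residues: int,
                        partner_kda: float = PHOA_APPARENT_KDA,
                        residue_kda: float = AVERAGE_RESIDUE_KDA) -> float:
    """Estimate the MW of a fusion once the signal peptide has been removed."""
    return partner_kda + n_mature_residues * residue_kda


if __name__ == "__main__":
    # Number of mature BlaF* residues retained in each construct (from the text).
    for plasmid, residues, observed in (("pLA71", 5, 47.5), ("pLA72", 61, 54.0)):
        estimate = expected_fusion_kda(residues)
        print(f"{plasmid}: estimated {estimate:.1f} kDa, observed about {observed} kDa")
```

Under these assumptions the estimates fall within about 1 to 1.5 kDa of the apparent weights read from the Western blot, which is within the usual uncertainty of SDS-PAGE sizing.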

Construction in M. smegmatis of a Bank of phoA Fusions with M. tuberculosis Genomic DNA Fragments

The genomic DNA of a clinical isolate of M. tuberculosis was purified and partially digested with Sau3A. The 400-2,000 bp fraction was inserted at the compatible BamHI site of pJEM11. The ligation products were transferred into E. coli XL1-Blue by electroporation as an amplification step. About 2,500 clones containing plasmids with inserts grew on an agar medium containing kanamycin. The plasmids purified from the transformants were pooled and transferred by electroporation into M. smegmatis mc²155. The transformed bacteria were plated on L agar-kanamycin-X-P. About 14,000 clones were obtained. After incubating for 4 days, the first blue, and therefore PhoA⁺, colonies were observed. Each day, the dishes were checked, and new PhoA⁺ colonies were isolated. The cloned colonies were lysed, and their DNA introduced by electroporation into E. coli XL1-Blue for the preparation of plasmids. In all, 12 different inserts allowing the expression of phoA were isolated and sequenced. Three sequences had similarities with known sequences.

Fusion of PhoA with the Gene for the 19 kDa M. tuberculosis Lipoprotein

One of the plasmids (pExp410) has an insert corresponding to part of the gene for the already known 19 kDa protein. This gene encodes an exported lipoprotein (5, 31). FIG. 4A shows the DNA sequence corresponding to the fusion between this gene and phoA. As expected, the same reading frame is maintained between the two proteins. The molecular weight of the fusion protein expected from the sequence is close to 57 kDa. However, the molecular weight observed by Western-blot analysis is identical to that of the purified PhoA protein (FIG. 3, columns 1 and 6), which suggests that the fusion protein is cleaved near the PhoA junction.

Fusion with a Sequence Similar to the Gene for the 28 kDa M. leprae Protein

The 28 kDa M. leprae protein is a major antigen which is very often recognized by the sera from patients suffering from the lepromatous form of leprosy (9). In the M. tuberculosis insertion bank prepared, a sequence carried by a recombinant vector (pExp53), exhibiting 77% similarity with the nucleotide sequence of this gene and 68% for the deduced amino acid sequence (FIG. 4B), was identified. In Western-blot analysis, the molecular weight of the fusion protein is about 52 kDa (FIG. 3, column 7), which corresponds to about 45 amino acids of the mycobacterial protein in the fusion protein, after cleavage of the signal peptide. This is consistent with the length of the fragment of the M. tuberculosis gene fused with phoA (FIG. 4B).

Southern-blot analyses of the M. tuberculosis genomic DNA were carried out. A 180 bp fragment of the 2 kb insert of the plasmid pExp53 was shown not to contain any restriction site for the endonucleases PstI, SmaI, BamHI, BstEII and SphI. This fragment was amplified by PCR. The M. tuberculosis genomic DNA was digested with each of these enzymes, and probed with the ³²P-labeled PCR fragment. As can be seen in FIG. 5, only one band was observed when the genomic DNA was digested with each of the five enzymes, which suggests that the gene is present in only one copy in the M. tuberculosis genome.
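The choice of a probe lacking sites for the five enzymes can be illustrated with the short Python sketch below, which scans a sequence for the standard recognition sequences of these endonucleases. The test sequences are hypothetical placeholders and not the 180 bp fragment itself; the function names are likewise illustrative.

```python
import re

# Standard recognition sequences; N in GGTNACC is written as a character class.
RECOGNITION_SITES = {
    "PstI": "CTGCAG",
    "SmaI": "CCCGGG",
    "BamHI": "GGATCC",
    "BstEII": "GGT[ACGT]ACC",
    "SphI": "GCATGC",
}


def sites_present(seq: str) -> dict:
    """Return, per enzyme, the 0-based positions of its sites in seq.

    All five sites are palindromic, so scanning one strand is sufficient.
    """
    seq = seq.upper()
    return {
        enzyme: [m.start() for m in re.finditer(f"(?={pattern})", seq)]
        for enzyme, pattern in RECOGNITION_SITES.items()
    }


def is_site_free(seq: str) -> bool:
    """True when none of the five enzymes cuts within seq."""
    return all(not positions for positions in sites_present(seq).values())


if __name__ == "__main__":
    print(is_site_free("ATGCGTACGTTAGC"))           # hypothetical probe
    print(sites_present("AAGGATCCTT")["BamHI"])     # hypothetical insert
```

When the probe itself is not cut, a single-copy gene is expected to hybridize to one fragment per digest, which is the reasoning behind the single band seen with each enzyme.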

Other PhoA Fusions Carrying the Putative Signal Sequences

FIG. 4C shows the sequence of an insert carried by a recombinant vector (pExp59) fused with phoA. It has a typical signal sequence allowing the export of proteins. The sequence presented conforms to the usual rules established in Gram-negative bacteria (25). It contains two positively charged amino acids (Arg, Asn) after the initiation codon, followed by a hydrophobic peptide, with a Gly, probably corresponding to a loop in the three-dimensional structure of the peptide. A potential site of cleavage by signal peptidase is indicated by an arrow; cleavage at this site would give a fusion protein with a molecular weight close to that of PhoA, consistent with FIG. 3, column 8.
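The two features named above, a positively charged amino-terminal region followed by a hydrophobic stretch, can be tested with a very simple heuristic, sketched here in Python. This is not a validated signal peptide predictor: the thresholds, the residue sets and the example peptides are all assumptions chosen for illustration, and neither peptide is the pExp59 sequence.

```python
POSITIVE = set("KR")             # Lys and Arg
HYDROPHOBIC = set("AVLIMFWC")    # simplified hydrophobic set (assumption)


def n_region_positive(peptide: str, n_region_len: int = 6) -> int:
    """Count Lys/Arg among the residues that follow the initiator Met."""
    return sum(1 for aa in peptide.upper()[1:1 + n_region_len] if aa in POSITIVE)


def longest_hydrophobic_run(peptide: str) -> int:
    """Length of the longest uninterrupted run of hydrophobic residues."""
    best = current = 0
    for aa in peptide.upper():
        current = current + 1 if aa in HYDROPHOBIC else 0
        best = max(best, current)
    return best


def looks_like_signal_peptide(peptide: str, min_positive: int = 1, min_run: int = 7) -> bool:
    """Apply both criteria with illustrative thresholds."""
    return (n_region_positive(peptide) >= min_positive
            and longest_hydrophobic_run(peptide) >= min_run)


if __name__ == "__main__":
    print(looks_like_signal_peptide("MKRLLAVLLALVAAGSAQA"))  # hypothetical exported protein
    print(looks_like_signal_peptide("MSDEEKTNQGSE"))         # hypothetical cytoplasmic protein
```

Gly is deliberately left out of the hydrophobic set here, so a Gly inside the hydrophobic core, as described for pExp59, would split the run; a more careful treatment would use a sliding-window hydropathy score.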

PhoA Fusion Proteins with Amino Acid Units Conserved with Stearoyl-acyl Carrier Protein (ACP) Desaturases

The ACP-desaturases are enzymes involved in the pathways for the biosynthesis of fatty acids. In general, these enzymes are integral membrane proteins (29). Analyses of the plasmid pExp421 of the prepared bank showed two amino acid units conserved with ACP-desaturases, one of 9 amino acids and the second of 14 amino acids (FIG. 4D). The rest of the sequence did not show any significant similarity with known proteins.

Discussion

More than 30 secreted proteins have been found in short-term culture filtrates of BCG or M. tuberculosis, with minimal lysis of the bacterium (1, 19, 38). These proteins have been classified according to their molecular weight and their immunological reactivities. Some were characterized more extensively. For example, the secreted proteins of the antigen 85 complex (antigens 85 A, B and C) are 32 kDa proteins exhibiting serological cross-reactions (7, 35). The antigens 85 A and 85 B exhibit an affinity toward fibronectin and might be involved in the internalization of M. tuberculosis in the macrophages. The genes for these immunogenic proteins (7), for the 23 kDa protein (MPB64) (37) and for the 19 kDa protein (5) have been cloned and sequenced, and sequences of signal peptides characteristic of exported proteins have been found. The recombinant proteins produced using these genes are thought to be valuable tools for the serological diagnosis of tuberculosis. Superoxide dismutase (SOD) of 23/28 kDa is abundant in short-term culture filtrates, and is thought to be involved in the survival of mycobacteria in the phagolysosome. The gene encoding SOD in M. tuberculosis has been cloned and sequenced (39). Interestingly, no characteristic signal peptide sequence has been found. This suggests a specific route for secretion of this enzyme by mycobacteria. Secreted proteins in two narrow molecular weight ranges (6-10 kDa and 26-34 kDa) are major T cell antigens (3) and induce, in mice, T cell immune responses which are protective against a challenge with live mycobacteria of the M. tuberculosis complex (4). It has been suggested that the differences in the immune responses observed between live and killed bacteria are due to these exported/secreted proteins (20). These various preliminary results suggest that a better characterization of exported/secreted proteins of pathogenic bacteria of the M. tuberculosis complex might be highly useful both for understanding their pathogenicity and for developing new vaccines.

While secreted proteins have been studied by biochemical methods, genetic methodologies might also prove necessary. Using a truncated phoA gene, fusion systems have been developed which allow the attachment of the amino ends of other proteins onto PhoA. This approach is based on the E. coli periplasmic alkaline phosphatase. This enzyme must be located extracytoplasmically to be active. Thus, alkaline phosphatase may be used as a subcellular localization probe.

A PhoA methodology has been developed and described here for the identification of proteins exported by mycobacteria. The insertion of blaF* into pJEM11 leads to the production, in M. smegmatis, of fusion proteins with alkaline phosphatase activity. Furthermore, PhoA fusions with 3 different fragments of BlaF* were enzymatically active, which suggests that most of the in-frame fusions with exported proteins will have a PhoA activity.

A bank of M. tuberculosis inserts in pJEM11 has been constructed and expressed in M. smegmatis. In this bank, part of the gene encoding the known exported lipoprotein of 19 kDa (pExp410) has been isolated. This M. tuberculosis protein is one of the serologically immunodominant antigens found in this bacillus. Analyses of the DNA sequence of the gene encoding this antigen indicate that the hydrophobic NH2-terminal region is a lipoprotein signal peptide (5). Part of this lipoprotein has been fused with the outer surface A protein of Borrelia burgdorferi to construct a recombinant BCG vaccine capable of inducing a high immune response (31).

Two other sequences sharing similarities with exported or membrane proteins have also been identified:

pExp53 was shown to exhibit similarities with the gene for the 28 kDa M. leprae antigen. This M. leprae antigen was found by screening a λgt11 library with serum from patients suffering from the lepromatous form of leprosy. It is a major antigen involved in the humoral immune response to M. leprae (9). Interestingly, it has been shown that a peptide of 20 amino acids of this protein exhibits considerable similarity with a peptide of the 19 kDa M. tuberculosis antigen, and that it is a cross-reactive T cell epitope (12). The DNA sequence of the gene encoding the 28 kDa M. leprae antigen suggests that “the abovementioned amino acid sequence of the protein contains a potential signal peptide at its amino-terminal end and two long hydrophobic domains, which suggests that it is screened for localization on the bacterial plasma membrane or the cell wall” (9).

A fusion protein encoded by a plasmid of our bank (pExp421) appears to share amino acid units with desaturases. The ACP-desaturases are enzymes involved in the pathways of the biosynthesis of fatty acids. In general, these enzymes are integral membrane proteins (29). This result suggests that part of a gene which is important in the metabolism of lipids in M. tuberculosis may have been isolated, possibly one involved in the lipid cell wall biosynthesis pathway.

Another plasmid (pExp59) with a characteristic putative signal sequence has been found.

In conclusion, the results presented demonstrate that the PhoA technology for the genetic identification of exported proteins may be successfully adapted for M. tuberculosis. Preliminary screenings of an insert bank giving PhoA fusion proteins have revealed sequences exhibiting similarities with known exported proteins.

II) Expression of the P28 M. tuberculosis Protein

BCG is a live vaccine. It is the only vaccine used to protect against tuberculosis. Its efficacy has proved variable according to the populations vaccinated, ranging from about 80% in Great Britain to 0% in India. It therefore seems essential to search for a more effective vaccine. Moreover, the use of a live vaccine currently poses problems because of the spread of the AIDS epidemic.

Several studies have shown that antigens exported by Mycobacterium tuberculosis, the agent of tuberculosis, have a protective effect against a challenge with the virulent strain. The studies reported here consisted in using a genetic method for isolating and studying the M. tuberculosis genes encoding exported proteins. We describe here the isolation and characterization of a gene encoding a protein having homologies with the 28 kDa Mycobacterium leprae protein already described.

Methodology for the Cloning of Genes Encoding Exported Proteins

The methodology presented in detail in part I is based on the use of translational fusions with the gene encoding the Escherichia coli alkaline phosphatase, PhoA. Such fusion proteins have a detectable alkaline phosphatase activity only if they are exported. A plasmid vector carrying a phoA gene lacking its promoter, its ribosome-binding site and its signal sequence was constructed. Using this vector, a PhoA activity can be observed only after translational fusion in the correct reading frame with an exported protein. The vector, called pJEM11, has a replication origin for E. coli and another for mycobacteria. It also has a selectable marker, the kanamycin-resistance gene of the transposon Tn903. A multiple cloning site precedes the truncated phoA gene.
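The conditions under which a clone is expected to turn blue on X-P can be summarized in a small decision rule, sketched below in Python. The class, field and function names are hypothetical; the rule only restates the logic of the paragraph above and is not a model of any specific insert.

```python
from dataclasses import dataclass


@dataclass
class Insert:
    """Features of a genomic fragment cloned upstream of the truncated phoA."""
    has_promoter: bool
    has_start_codon: bool
    coding_nt_before_junction: int  # nt from the first base of the start codon to the phoA junction
    encodes_export_signal: bool


def in_frame(insert: Insert) -> bool:
    """The fusion keeps the reading frame when the upstream coding length is a multiple of 3."""
    return insert.coding_nt_before_junction % 3 == 0


def expected_colony_color(insert: Insert) -> str:
    """Blue only if the fusion is transcribed, translated in frame and exported."""
    active = (insert.has_promoter and insert.has_start_codon
              and in_frame(insert) and insert.encodes_export_signal)
    return "blue" if active else "white"


if __name__ == "__main__":
    print(expected_colony_color(Insert(True, True, 111, True)))   # in frame, exported
    print(expected_colony_color(Insert(True, True, 112, True)))   # frameshifted
    print(expected_colony_color(Insert(True, True, 111, False)))  # not exported
```

Because all four conditions must hold at once, only a small fraction of random inserts is expected to give PhoA⁺ colonies, which fits the low number of positive clones recovered from the library.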

A genomic DNA library obtained from an M. tuberculosis strain (Mt103) isolated from a tuberculosis patient was constructed in pJEM11 by inserting DNA fragments derived from a partial hydrolysis by the enzyme Sau3A. The clones selected made it possible to identify a nucleotide fragment of the 28 kDa M. tuberculosis gene homologous to the gene encoding the 28 kDa M. leprae protein.

In lepromatous patients, antibodies directed against this 28 kDa protein are observed, suggesting that this protein is an immunodominant antigen. It was hypothesized that in M. tuberculosis, the 28 kDa protein possessing homologies with the 28 kDa M. leprae protein could also be an immunodominant antigen and that it could serve in the construction of specific immunological tests allowing the detection of the tuberculosis infection or of the tuberculosis disease. It could perhaps be used for the construction of subunit vaccines in different vaccine preparations. Furthermore, it could be useful as a vector for the expression of antigens in mycobacteria for the construction of recombinant vaccines.

Cloning and Sequencing of the Gene Encoding a 28 kDa M. tuberculosis Protein

Using the insert contained in the plasmid pExp53 as probe, the whole gene encoding the 28 kDa M. tuberculosis protein was cloned by colony hybridization of an M. tuberculosis DNA library constructed by inserting M. tuberculosis DNA fragments of between 2 and 6 kb in size, obtained by total hydrolysis with the enzyme PstI, into the vector pBluescript KS-. The M. tuberculosis PstI fragment corresponding to the positive clone and comprising a 4.1 kb insert was sequenced. FIG. 10 shows the nucleotide sequence of the fragment and the similarities with the gene encoding the 28 kDa M. leprae protein. The sequence of the 28 kDa M. tuberculosis gene is, like that of M. leprae, preceded by a sequence possessing similarities with the “iron” boxes found upstream of the genes expressed during an iron deficiency. An iron deficiency situation is encountered during growth in vivo. It is hypothesized that the expression of this gene is induced during the growth, in the macrophages, of the mycobacteria harboring this gene. Furthermore, the 28 kDa M. tuberculosis protein possesses, in its central part, two regions containing units of 5 amino acids repeated in tandem, which are absent from the homologous M. leprae protein. Analogous repeated structures have been previously identified in major antigens present at the surface of other bacterial or parasitic pathogenic agents such as the M protein (40) of the Streptococcaceae and the CS protein of the Plasmodia (41).

All or part of the 28 kDa M. tuberculosis protein, whose gene sequence is presented here, could be a potential protective antigen for the construction of a tuberculosis vaccine. Such an antigen may be obtained by purification from cellular extracts of M. tuberculosis or from cellular extracts of genetically recombined heterologous organisms. Furthermore, the 28 kDa M. tuberculosis protein, or peptides derived therefrom, could be an antigen capable of being used in ELISA tests for screening tuberculosis patients.

By using the 28 kDa M. tuberculosis gene as probe, hybridization under conditions of high stringency was observed only with the genomic DNA of strains belonging to the M. tuberculosis complex. Consequently, the sequence corresponding to the 28 kDa M. tuberculosis gene is a specific sequence which may be used for tests for detection of the tuberculosis bacilli, using DNA or RNA probes and in vitro methods of gene amplification.

The regulatory region and the 28 kDa M. tuberculosis gene may be used as carrier molecules to express heterologous antigens in BCG or any other mycobacterial vector useful for the construction of vaccines.

III) Expression of Mycobacteria Genes; Evaluation of Different Expression Promoters

An important aspect of the results obtained relates to the construction of genetic tools for studying the expression of genes in mycobacteria. Regulatory sequence-probe vectors have been used in the prior art to isolate and analyze regulatory sequences in a large number of bacteria (54). The definition, by the inventors, of such tools specific to mycobacteria facilitates the study of the genetic mechanisms regulating virulence in the pathogenic species, and the isolation of new regulatory sequences which might be useful for developing improved recombinant BCG vaccines.

Initially, the expression of mycobacterial genes was studied in heterologous systems, Escherichia coli and Streptomyces lividans (46) (51) (60). These analyses suggest that most of the mycobacterial genes are more efficiently expressed in S. lividans than in E. coli. Subsequently, vectors based on mycobacterial plasmids were constructed which might be used for studies in homologous systems. The vectors pYUB75 and pYUB76 were designed to select gene fusions with a truncated Escherichia coli lacZ gene (42). The plasmid pSD7 allows the construction of operon fusions with a promoterless gene for chloramphenicol acetyltransferase (CAT) (47). By using these vectors, a number of mycobacterial regulatory sequences were isolated and evaluated both in E. coli and in Mycobacterium smegmatis.

The inventors have described other constructions of vectors of the pJEM series, which have several advantages: they carry a transcription terminator and suitable multiple cloning sites, and they allow both operon fusions and gene fusions with lacZ. lacZ was chosen as reporter gene because the enzyme encoded, β-galactosidase, remains active when heterologous sequences are fused with its amino-terminal end (45) (64). Its activity may be easily measured in vitro, even at very low levels, with the aid of fluorescent compounds (48). β-Galactosidase is also highly immunogenic. It induces both humoral and cellular immune responses after presentation to the mouse immune system by recombinant bacteria (44) (56). Thus, β-galactosidase may also be used as reporter of the immunogenicity of a recombinant vaccine. By using pJEM vectors, new regulatory sequences active in BCG could be isolated and the recombinant BCG strains easily tested for their capacity to induce immune responses in mice.

A comparative study of the activities of various promoters in M. smegmatis and BCG was also made. The results suggest that the RNA polymerases of M. smegmatis and of BCG do not share the same specificity.

The construction of pJEM vectors. Ideally, a promoter-probe plasmid vector should contain five elements:

i) a replicon, ii) a selectable marker, and a reporter cassette containing iii) a transcription terminator followed by iv) multiple cloning sites (MCS) and v) a reporter gene lacking its regulatory sequences.
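The five elements listed above, and the required order of the reporter cassette, can be expressed as a simple data structure with a consistency check, sketched here in Python. The class and function names are hypothetical, and the example entry only paraphrases the description of pJEM15 given in this section; it is not a sequence-level map of the plasmid.

```python
from dataclasses import dataclass, field
from typing import List

# Order required within the reporter cassette, 5' to 3'.
CASSETTE_ORDER = ["terminator", "MCS", "reporter"]


@dataclass
class ProbeVector:
    name: str
    replicons: List[str]
    marker: str
    cassette: List[str] = field(default_factory=list)  # element kinds, 5' to 3'


def problems(vector: ProbeVector) -> List[str]:
    """Return the design rules the vector violates (empty list if none)."""
    issues = []
    if not vector.replicons:
        issues.append("no replicon")
    if not vector.marker:
        issues.append("no selectable marker")
    missing = [kind for kind in CASSETTE_ORDER if kind not in vector.cassette]
    if missing:
        issues.append("missing cassette element(s): " + ", ".join(missing))
    else:
        positions = [vector.cassette.index(kind) for kind in CASSETTE_ORDER]
        if positions != sorted(positions):
            issues.append("cassette elements out of order")
    return issues


if __name__ == "__main__":
    pjem15_like = ProbeVector(
        name="pJEM15-like",
        replicons=["pAL5000 (mycobacteria)", "E. coli replicon"],
        marker="aph (Tn903)",
        cassette=["terminator", "MCS", "reporter"],  # tT4, MCS1, sRBS-cII-lacZ
    )
    print(problems(pjem15_like))
```

Placing the terminator upstream of the cloning sites is what keeps read-through transcription from the vector backbone out of the reporter, so that reporter activity reflects only the cloned fragment.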

To construct a promoter cloning vector for mycobacteria, the replicon derived from the plasmid pAL5000 of Mycobacterium fortuitum, and the kanamycin resistance gene (aph) of Tn903 (58) were used. These genetic elements are basic components of most plasmids currently used for the transformation of mycobacteria. They appear to confer high stability on transformed clones of M. smegmatis and M. bovis BCG both in vitro and in vivo (in mice) even in the absence of selection by antibiotics (56). To facilitate the preparation and manipulation of episomal DNA, most of these plasmids also contain an E. coli replicon. Thus, we chose the plasmid pRR3, an E. coli-mycobacteria shuttle vector which contains these three genetic elements, as basic vector (58).

No mycobacterial transcription terminator has yet been characterized. To examine whether the T4 coliphage transcription terminator (tT4) was active as a termination site for the mycobacterial RNA polymerases, the omega interposon (57) was cloned into the plasmid pJN3, upstream of the sRBS-cII-lacZ element, generating pJN11 (FIG. 11). The omega fragment is composed of a streptomycin/spectinomycin resistance gene flanked by short inverted repeats containing tT4. The insertion of omega into a DNA fragment leads to termination of the synthesis of RNA in E. coli (57). pJN3 was constructed by cloning, into the ScaI site of pRR3, a cassette composed of a truncated lacZ combined with a synthetic RBS (sRBS), the 5′ end of the lambda phage cII regulatory gene and the pL promoter (FIG. 11). M. smegmatis mc²155 (61) was transformed with pJN3 (pL-sRBS-cII-lacZ) or pJN11 (pL-Ω-sRBS-cII-lacZ) by electroporation and the transformant clones were identified after growth on LB-Xgal plates. The transformant clones carrying pJN3 gave blue colonies and the transformant clones carrying pJN11 gave white colonies. The β-galactosidase activity in M. smegmatis (pJN11) was 50 times lower than that in M. smegmatis (pJN3) (Table 2). Thus, tT4 contained in the omega insert acts as an efficient transcription terminator in M. smegmatis.

A DNA fragment containing the tT4 segment followed by the sRBS-cII-lacZ element of pJN11 was synthesized in vitro by PCR amplification and an MCS (MCS1), containing 6 unique restriction sites, was added. The resulting cassette was then cloned into the ScaI site of pRR3, giving the operon fusion vector pJEM15 (FIG. 12). The electroporation of M. smegmatis mc²155 and of BCG with this plasmid led to white colonies on LB-Xgal plates with a very weak β-galactosidase activity (Table 2). On the other hand, in E. coli, pJEM15 expressed a higher β-galactosidase activity, and consequently gave a blue color on LB-Xgal plates. This is probably due to its high copy number. In E. coli, pUC vectors are present at a high copy number (greater than 500), whereas in mycobacteria, plasmids derived from the pAL5000 replicon have a copy number of approximately 3 to 10 (50). The testing of DNA fragments for promoter activity, with the aid of pJEM15, by blue-white screening, should thus be carried out directly in mycobacteria.

To obtain vectors allowing fusions of genes with lacZ, we followed a similar strategy. The three forms of truncated lacZ of the pNM480 series (55), which differ from each other in the translational phasing of a HindIII site located at their 5′ end, were cloned, downstream of tT4 and of an MCS (MCS2) containing 7 unique restriction sites, into the ScaI site of pRR3. The resulting plasmids pJEM12-13-14 (FIG. 12) thus allow the cloning of a wide range of restriction fragments in phase with lacZ.

Evaluation of various promoters in M. smegmatis and BCG. Operon fusions between the cII-lacZ reporter cassette of pJEM15 and the promoters pAN (56), pblaF* (63), psul3 (52) and pgroES/EL1 (49) were constructed. The activity of these promoters was evaluated in M. smegmatis and in M. bovis BCG. The first three promoters were isolated from mycobacterial species: pblaF* is a high expression mutant of pblaF, which directs the expression of the M. fortuitum β-lactamase gene; pAN is an M. paratuberculosis promoter and psul3 a component of a mobile genetic element of M. fortuitum, Tn610. These promoters were localized on the basis of the mapping of sites of initiation of transcription (pblaF* and pAN) or by deletion analysis (psul3) (62). pgroES/EL1 is a Streptomyces albus promoter which regulates the expression of the groES/EL1 operon, and is active both in M. smegmatis and BCG (65).

The cloning experiments were carried out directly in M. smegmatis. DNA fragments containing each of the promoters were isolated and inserted at MCS1 of pJEM15 digested with the appropriate restriction enzymes. The resulting ligation mixtures were used to transform M. smegmatis mc²155 by electroporation and blue colonies were selected in order to electroduce E. coli MC1061 (45) as previously described (43). The plasmids were isolated from these E. coli clones and analyzed. Those corresponding to the desired constructs pJN29 to pJN32 (Table 2) were used for the electroporation of BCG (Pasteur strain).

The β-galactosidase activity was assayed on sonicated extracts of M. smegmatis and of BCG (Table 2). The activity of the promoters varied considerably both between the promoters in a given mycobacterial host and between the hosts for each promoter. The relative strength of these promoters was not the same in M. smegmatis and BCG. Although pblaF* was the most powerful promoter both in M. smegmatis and in BCG, the situation is different for the other promoters: pAN and pgroES/EL1 were more active than psul3 in BCG, but in M. smegmatis, psul3 was more active than pAN or pgroES/EL1.

Das Gupta and his colleagues (47) screened M. smegmatis and M. tuberculosis DNA libraries for promoter activity in M. smegmatis. They reported a promoter frequency 10 to 20 times higher in the M. smegmatis DNA. Furthermore, very active promoters were rarer in the M. tuberculosis DNA libraries than in those of M. smegmatis. These authors suggested that the M. tuberculosis promoters may have diverged considerably from those of M. smegmatis. The results presented here suggest that the transcriptional machinery of M. smegmatis and of M. bovis BCG, a species closely related to M. tuberculosis, may be different.

In conclusion, the family of vectors constructed facilitates the study of the expression of genes in mycobacteria. A wide range of fragments may be easily cloned in phase with lacZ′ (gene fusions) or upstream of cII-lacZ (operon fusions) and evaluated for promoter activity by blue-white screening of mycobacterial transformants on LB-Xgal plates. The activity of these promoters may also be measured (by assaying the β-galactosidase activity), their sequences determined, and their site for initiation of transcription mapped (by primer extension analysis) using the “universal primer” or related sequences (53) as primer.

IV) Expression of the ERP Protein in Recombinant form in E. coli

The ERP protein was expressed in recombinant form in E. coli and purified by affinity chromatography. Two types of fusions between ERP and peptide fragments having a high affinity for specific chromatographic supports (amylose for the MalE system; chelated nickel (Ni²⁺) for the histidine system) were carried out. They are:

ERP lacking its signal sequence fused at the C-ter with the maltose-binding protein (MalE) of E. coli (MalE-ERP);

ERP lacking its signal sequence (ERP(His)₆ ss) or in its entirety (ERP(His)₆), and possessing 6 C-ter Histidine amino acids.

After purification, analysis of these three fusion proteins by SDS-PAGE indicates that the ERP polypeptide possesses a relative molecular weight (MW) of 36 kDa. There is a major difference between the MW calculated from the sequence (28 kDa) and the MW observed experimentally (36 kDa). This retardation in electrophoretic migration could be due to the high content of proline residues, or to post-translational modifications.

REFERENCES

1. Altschul, S. F. et al., 1990, J. Mol. Biol., 215: 403-410.

2. Andersen, P. et al., 1991, Infect. Immun. 59: 1905-1910.

3. Andersen, P. et al., 1991, Infect. Immun. 59: 1558-1563.

4. Andersen, P. et al., 1994, Infect. Immun. 62: 2536-2544.

5. Ashbridge, K. R. et al., 1989, Nucl. Acid. Res. 17: 1249.

6. Boquet, P. et al., 1987, J. Bacteriol. 169: 1663-1669.

7. Borremans, M. et al., 1989, Infect. Immun. 57: 3123-3130.

8. Brockman, R. W. et al., 1968, Biochemistry 7: 2554-2561.

9. Cherayil, B. et al., 1988, J. Immunol. 141: 4370-4375.

10. Gaillard, J.-L. et al., 1991, Cell 65: 1127-1141.

11. Gutierrez, C. et al., 1989, Nucl. Acids. Res. 17: 3999.

12. Harris, D. P. et al., 1991, J. Immunol. 147: 2706-2712.

13. Hoffman, C. S. et al., 1985, Proc. Natl. Acad. Sci. USA 82: 5107-5111.

14. Isberg, R. R. et al., 1987, Cell 50: 769-778.

15. Knapp, S. et al., 1988, J. Bact. 170: 5059-5066.

16. Manoil, C. et al., 1990, J. Bacteriol. 172: 515-518.

17. Miller, V. L. et al., 1987, Cell 48: 271-279.

18. Minton, N. P., 1984, Gene. 31: 269-273.

19. Nagai, S. et al., 1991, Infect. Immun. 59: 372-382.

20. Orme, I. M., 1988, Infect. Immun. 56: 3310-3312.

21. Orme, I. M. et al., 1993, J. Infect. Disea. 167: 1481-1497.

22. Pearce, B. J. et al., 1993, Mol. Microbiol. 9: 1037-1050.

23. Pearson, W. R. et al., 1988, Proc. Natl. Acad. Sci. USA 85: 2444-2448.

24. Prentki, P. et al., 1984, Gene. 29: 303-313.

25. Pugsley, A. P., 1993, Microbiol. Rev. 57: 50-108.

26. Ranes, M. G. et al., 1990, J. Bacteriol. 172: 2793-2797.

27. Sambrook, J. et al., 1989, Molecular Cloning: a Laboratory Manual, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

28. Sanger, F. et al., 1977, Proc. Natl. Acad. Sci. USA 74: 5463-5467.

29. Shanklin, J. et al., 1991, Proc. Natl. Acad. Sci. USA 88: 2510-2514.

30. Snapper, S. B. et al., 1990, Mol. Microbiol. 4: 1911-1919.

31. Stover, K. C. et al., 1993, J. Exp. Med. 178: 197-209.

32. Taylor, R. K. et al., 1987, Proc. Natl. Acad. Sci. USA 84: 2833-2837.

33. Taylor, R. K. et al., 1989, J. Bact. 171: 1870-1878.

34. Timm, J. et al., 1994, Mol. Microbiol. 12: 491-504.

35. Wiker, H. G. et al., 1992, Microbiol. Rev. 56: 648-661.

36. Winter, N. et al., 1991, Gene 109: 47-54.

37. Yamaguchi, R. et al., 1989, Infect. Immun. 57: 283-288.

38. Young, D. B. et al., 1992, Mol. Microbiol. 6: 133-145.

39. Zhang, Y. et al., 1991, Mol. Microbiol. 5: 381-391.

40. Hollingshead, S. et al., 1986, J. Biol. Chem. 262: 1677-1686.

41. Zavala, F. et al., J. Exp. Med. 157: 194-1957.

42. Barletta, R. G. et al., 1992, J. Gen. Microbiol. 138: 23-30.

43. Baulard, A. et al., 1992, Nucl. Acids Res. 20: 4105.

44. Brown, A. et al., 1987, J. Infect. Dis. 155: 86-92.

45. Casadaban, M. J. et al., 1980, J. Bacteriol. 143: 971-980.

46. Clark-Curtiss, J. E. et al., 1985, J. Bacteriol. 161: 1093-1102.

47. Das Gupta, S. K. et al., 1993, J. Bacteriol. 175: 5186-5192.

48. Garcia-del-Portillo, F. et al., 1992, Mol. Microbiol. 6: 3289-3297.

49. Guglielmi, G. et al., 1993, Basic and Applied Genetics. American Society for Microbiology, Washington, D.C.

50. Hatfull, G. H. et al., 1993, Genetic transformation of mycobacteria. TIM 1: 310-314.

51. Kieser, T. et al., 1986, J. Bacteriol. 168: 72-80.

52. Martin, C. et al., 1990, Nature 345: 739-743.

53. Messing, J., 1983, New M13 vectors for cloning, p. 20-78. In R. Wu, L. Grossman and K. Moldave (eds.), Methods in Enzymology. Academic Press, New York.

54. Miller, J. H., 1991, Bacterial Genetic Systems. In J. N. Abelson and M. I. Simon (eds.), Methods in Enzymology. Academic Press, San Diego.

55. Minton, N. P., 1984, Gene 31: 269-273.

56. Murray, A. et al., 1992, Mol. Microbiol. 6: 3331-3342.

57. Prentki, P. et al., 1984, Gene 29: 303-313.

58. Ranes, M. G. et al., 1990, J. Bacteriol. 172: 2793-2797.

59. Sambrook, J. et al., 1989, Molecular Cloning: a Laboratory Manual, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

60. Sirakova, T. D. et al., 1989, FEMS Microbiol. Lett. 59: 153-156.

61. Snapper, S. B. et al., 1990, Mol. Microbiol. 4: 1911-1919.

62. Timm, J. et al. Unpublished data.

TABLE 1
Strain/Plasmid | Relevant characteristics | Reference
E. coli XL1-Blue | supE44 hsdR17 recA1 gyrA46 thi relA1 lac⁻ F′ | 27
M. smegmatis mc²155 | High-transformant mutant of M. smegmatis ATCC607 | 30
pRR3 | E. coli-mycobacteria shuttle vector | 26
pPHO7 | pUC derivative carrying a truncated phoA gene | 11
pNM480 | pUC derivative carrying a truncated lacZ gene | 18
pJEM11 | E. coli-mycobacteria shuttle vector carrying a truncated phoA gene | this work
pLA71 | pJEM11 in which has been cloned a 1,364 bp fragment from blaF* | 34, this work
pLA72 | pJEM11 in which has been cloned a 1,550 bp fragment from blaF* | 34, this work
pLA73 | pJEM11 in which has been cloned the complete blaF* | 34, this work
pExp410 | pJEM11 in which has been cloned part of the M. tuberculosis 19 kDa antigen gene | this work
pExp53 | pJEM11 in which has been cloned part of a M. tuberculosis gene similar to the M. leprae 28 kDa antigen gene | this work
pExp59 | pJEM11 in which has been cloned the signal sequence of a M. tuberculosis unidentified gene | this work
pExp421 | pJEM11 in which has been cloned a M. tuberculosis gene encoding a protein with amino acid motifs similar to desaturases | this work

What is claimed is:
 1. Recombinant vector plasmid pJEM11 deposited at CNCM under the No. I-1375.
 2. Recombinant vector according to claim 1, which contains a coding sequence derived from the phoA gene that is truncated under conditions such that a polypeptide expressed by this sequence conserves the alkaline phosphatase activity.
 3. Recombinant vector selected from the group consisting of: pExp53 deposited at CNCM under the No. I-1464; pExp59 deposited at CNCM under the No. I-1465; pExp410 deposited at CNCM under the No. I-1466; and pExp421 deposited at CNCM under the No. I-1467.
 4. Recombinant vector plasmid pIPX412 deposited at CNCM under the No. I-1463.
 5. Recombinant vector that replicates in mycobacteria, wherein the vector consists essentially of: (A) a replicon that is functional in mycobacteria; (B) a selectable marker; (C) a reporter cassette comprising 1) a multiple cloning site (polylinker), 2) a mycobacteria nucleotide sequence comprising genomic DNA or cDNA of a pathogenic mycobacterium, which is inserted into one of the cloning sites of the polylinker; 3) a transcription terminator that is active in mycobacteria, and which is upstream of the polylinker, and 4) a coding nucleotide sequence derived from a gene encoding a marker or reporter for expression and/or export and/or secretion of protein, the said coding nucleotide sequence lacking its initiation codon and its regulatory sequences.
 6. Recombinant vector that replicates in mycobacteria, wherein the vector consists essentially of: (A) a replicon that is functional in mycobacteria; (B) a selectable marker; (C) a reporter cassette comprising 1) a multiple cloning site (polylinker), 2) a T4 coliphage terminator (tT4), which is upstream of the polylinker, and 3) a coding nucleotide sequence derived from a gene encoding a marker or reporter for expression and/or export and/or secretion of protein, the said coding nucleotide sequence lacking its initiation codon and its regulatory sequences.
 7. Recombinant vector according to claim 5 or 6 comprising, in one of the polylinker cloning sites, a nucleotide sequence from a mycobacterium in which the presence of regulatory sequences is being sought, making it possible, when the vector is integrated in a mycobacterium cellular host, to obtain the export and/or secretion of an expressed polypeptide or protein product of said nucleotide sequence.
 8. Recombinant vector according to claim 5, wherein the mycobacteria nucleotide sequence is obtained by enzymatic digestion of genomic DNA or cDNA of M. tuberculosis.
 9. Recombinant vector according to claim 8, wherein the M. tuberculosis DNA is digested with Sau3A.
 10. Recombinant vector according to claim 5, wherein the mycobacterium is selected from the group consisting of M. africanum, M. bovis, M. avium and M. leprae.
 11. Process for screening nucleotide sequences derived from mycobacteria to determine the presence of regulatory elements that control the expression of nucleic acid sequences in a cellular host, and/or control the export and/or secretion of polypeptide sequences in a cellular host, comprising the following steps: (A) providing digests of mycobacteria DNA by: 1) digesting mycobacteria DNA with at least one determined enzyme; or 2) synthesizing digests in vitro by an amplification technique; (B) inserting the digests of step (A) into a multiple cloning site of a recombinant vector that replicates in mycobacteria, wherein the vector comprises: (1) a replicon that is functional in mycobacteria; (2) a selectable marker; and (3) a reporter cassette comprising: a) the multiple cloning site (polylinker); b) a transcription terminator that is active in mycobacteria, and which is upstream of the polylinker; and c) a coding nucleotide sequence derived from a gene encoding a marker for expression and/or export and/or secretion of protein, the said coding nucleotide sequence lacking its initiation codon and its regulatory sequences; (C) optionally, amplifying the digest contained in the vector; (D) inserting the vector of step (B) or (C) into cellular hosts; (E) culturing the cellular hosts of step (D) in a medium allowing detection of the marker for export and/or secretion which is contained in the vector; (F) detecting the cellular hosts that are positive for the expression of the marker for export and/or secretion (positive colonies); (G) isolating DNA from the positive colonies; (H) inserting the DNA of step (G) into a determined cell; (I) selecting the inserts contained in the vector that allow clones positive for the marker for export and/or secretion to be obtained; (J) isolating the digests of mycobacteria sequences that are contained in the inserts; and (K) characterizing the digests of step (J).
 12. Screening process according to claim 11, wherein the mycobacteria DNA is derived from a pathogenic mycobacterium or a nonpathogenic mycobacterium.
 13. Screening process according to claim 12, wherein the pathogenic mycobacterium is selected from the group consisting of M. tuberculosis, M. bovis, M. avium, M. africanum, and M. leprae.
 14. Recombinant mycobacterium comprising a vector according to claim 11, wherein the vector comprises a mycobacteria nucleotide sequence obtained by enzymatic digestion of genomic DNA or cDNA of a pathogenic mycobacterium, which is inserted into one of the cloning sites of the polylinker of said vector.
 15. Recombinant mycobacterium according to claim 14, wherein the mycobacterium is an M. smegmatis strain.
 16. Recombinant mycobacterium according to claim 14, wherein the mycobacterium is an M. bovis strain.
 17. Nucleotide sequence derived from a gene encoding an exported M. tuberculosis protein, wherein said nucleotide sequence is selected from the group consisting of: (A) a sequence comprising SEQ ID NO:38 in FIG. 6A or SEQ ID NO:40 in FIG. 6B or a sequence hybridizing under stringent conditions with said SEQ ID NO:38 or SEQ ID NO:40; (B) a sequence comprising said SEQ ID NO:38 or SEQ ID NO:40, which encodes an M. tuberculosis P28 protein having a molecular weight of about 28 kDa; (C) a sequence contained in said SEQ ID NO:38 or SEQ ID NO:40, which encodes a polypeptide recognized by antibodies directed against the M. tuberculosis P28 protein; (D) a sequence comprising the regulatory sequences of the gene comprising said SEQ ID NO:38 or SEQ ID NO:40; (E) a sequence between nucleotides 1 and 72 of said SEQ ID NO:38 or SEQ ID NO:40, which comprises a signal sequence; (F) a sequence between nucleotides 62 and 687 of said SEQ ID NO:38 or SEQ ID NO:40; and (G) a sequence between nucleotides 688 and 855 of said SEQ ID NO:38 or SEQ ID NO:40.
 18. Composition for the in vitro detection of an M. tuberculosis infection comprising a nucleotide sequence containing at least 9 nucleotides, which is derived from a sequence according to claim 17, or a nucleotide sequence containing at least 9 nucleotides and hybridizing, under stringent conditions, with M. tuberculosis DNA and not hybridizing, under the same conditions, with M. leprae DNA, this sequence being a DNA or RNA sequence, which is labeled where appropriate.
 19. Recombinant vector selected from the group consisting of plasmids pJEM12, pJEM13, pJEM14, and pJEM15.
 20. Recombinant vector according to claim 5 or 6, comprising a sequence of a promoter at one of the cloning sites of the polylinker.
 21. Screening process according to claim 11, wherein the vector used is pJEM11 (CNCM I-1375) and the digestion of the mycobacteria DNA is performed with Sau3A.
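
For illustration only, the enzymatic digestion recited in claims 9 and 21 can be modeled in silico. The sketch below is not part of the claimed subject matter. It assumes a complete digest of a linear molecule and relies on the known Sau3A recognition site 5'-GATC-3', cut immediately before the G. The function name sau3a_fragments and the input sequence are hypothetical and were chosen only for this example.

```python
# Illustrative sketch: in silico Sau3A digestion of a linear DNA string.
# Assumption: complete digestion; Sau3A cuts immediately 5' of each GATC,
# so every fragment after the first begins with GATC on the top strand.

SAU3A_SITE = "GATC"


def sau3a_fragments(seq: str) -> list[str]:
    """Return top-strand fragments of a linear molecule cut at every GATC."""
    seq = seq.upper()
    if not seq:
        return []
    fragments = []
    start = 0
    pos = seq.find(SAU3A_SITE)
    while pos != -1:
        if pos > start:
            # Close the fragment that ends just before this cut position.
            fragments.append(seq[start:pos])
            start = pos
        pos = seq.find(SAU3A_SITE, pos + 1)
    fragments.append(seq[start:])  # trailing fragment up to the molecule end
    return fragments


if __name__ == "__main__":
    # Made-up sequence, not taken from any mycobacterial genome.
    for fragment in sau3a_fragments("AAGATCCCGATCTT"):
        print(len(fragment), fragment)
```

Concatenating the returned fragments restores the input sequence, which gives a quick sanity check on the cut positions.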